Characterisation of scientific and popular science discourse in French, Japanese and Russian

نویسندگان

  • Lorraine Goeuriot
  • Natalia Grabar
  • Béatrice Daille
چکیده

We aim to characterise the comparability of corpora, we address this issue in the trilingual context through the distinction of expert and non expert documents. We work separately with corpora composed of documents from the medical domain in three languages (French, Japanese and Russian) which present an important linguistic distance between them. In our approach, documents are characterised in each language by their topic and by a discursive typology positioned at three levels of document analysis: structural, modal and lexical. The document typology is implemented with two learning algorithms (SVMlight and C4.5). Evaluation of results shows that the proposed discursive typology can be transposed from one language to another, as it indeed allows to distinguish the two aimed discourses (science and popular science). However, we observe that performances vary a lot according to languages, algorithms and types of discursive

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Characterization of Scientific and Popular Science Discourse in French, Japanese and Russian

We aim to characterize the comparability of corpora, we address this issue in the trilingual context through the distinction of expert and non expert documents. We work separately with corpora composed of documents from the medical domain in three languages (French, Japanese and Russian) which present an important linguistic distance between them. In our approach, documents are characterized in...

متن کامل

Language Features of Russian Texts of Engineering Discourse

The Article is devoted to the applied problem of identifying the linguistic features of engineering texts. The study of Russian-language texts of engineering discourse is usually of an applied nature, in our case, this applied research is caused by the need to teach foreigners who receive professional engineering education in Russia and in Russian language. The object of the research is the Rus...

متن کامل

Compilation of Specialized Comparable Corpora in French and Japanese

We present in this paper the development of a specialized comparable corpora compilation tool, for which quality would be close to a manually compiled corpus. The comparability is based on three levels: domain, topic and type of discourse. Domain and topic can be filtered with the keywords used through web search. But the detection of the type of discourse needs a wide linguistic analysis. The ...

متن کامل

Testing Problems in Russian as a Foreign Language in a Technical University

 Problems of theory and practice of the Russian as a foreign language testing for entrants in technical universities are considered. The benefits of test forms for controlling the foreign students’ skills in the Russian language during a hard time limit are presented. The structure and content of the tests, all types of tasks offered on the entrance and final examinations in the Russian languag...

متن کامل

Metadiscourse Use in Popular and Professional Science: The Case of Hedges and Boosters

The present article shows that all scientific texts included in journals, magazines, and newspapers are vulnerable to the penetration of hedges and boosters.  However, it was found that scientific texts in the three corpora tended to open up the possibilities of alternative voices rather than narrowing them down. The relatively higher frequency of occurrence of hedges in comparison with booster...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008